Embedding Words as Distributions with a Bayesian Skip-gram Model
We introduce a method for embedding words as probability densities in a
low-dimensional space. Rather than assuming that a word embedding is fixed
across the entire text collection, as in standard word embedding methods, in
our Bayesian model we generate it from a word-specific prior density for each
occurrence of a given word. Intuitively, for each word, the prior density
encodes the distribution of its potential 'meanings'. These prior densities are
conceptually similar to Gaussian embeddings. Interestingly, unlike Gaussian
embeddings, our model also yields context-specific densities: they encode
uncertainty about the sense of a word given its context and correspond to
posterior distributions within our model. The context-dependent densities have
many potential applications: for example, we show that they can be directly
used in the lexical substitution task. We describe an effective estimation
method based on the variational autoencoding framework. We also demonstrate
that our embeddings achieve competitive results on standard benchmarks.

Comment: COLING 2018. For the associated code, see
https://github.com/ixlan/BS
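To make the modelling idea concrete, here is a minimal PyTorch sketch of a word-as-density model in this spirit. All class and layer names are our own illustrative choices, not the paper's code, and the full training objective would also include a term for reconstructing the context words, which is omitted here:

```python
import torch
import torch.nn as nn

class BayesianSkipGramSketch(nn.Module):
    """Sketch: each word type has a Gaussian prior over embeddings; an
    inference network produces a context-specific Gaussian posterior."""

    def __init__(self, vocab_size, dim):
        super().__init__()
        # Word-specific prior density: a mean and log-variance per word type.
        self.prior_mu = nn.Embedding(vocab_size, dim)
        self.prior_logvar = nn.Embedding(vocab_size, dim)
        # Inference network: maps (word, context) to posterior parameters.
        self.ctx_emb = nn.Embedding(vocab_size, dim)
        self.post_mu = nn.Linear(2 * dim, dim)
        self.post_logvar = nn.Linear(2 * dim, dim)

    def forward(self, word, context):
        # word: (batch,) ids; context: (batch, window) ids.
        ctx = self.ctx_emb(context).mean(dim=1)  # pool the context window
        h = torch.cat([self.prior_mu(word), ctx], dim=-1)
        mu, logvar = self.post_mu(h), self.post_logvar(h)
        # Reparameterization trick, as in the variational autoencoding framework.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        # KL between the context-specific posterior and the word's prior
        # (both diagonal Gaussians); this is the regularizer in the ELBO.
        p_mu, p_logvar = self.prior_mu(word), self.prior_logvar(word)
        kl = 0.5 * ((logvar - p_logvar).exp()
                    + (mu - p_mu).pow(2) / p_logvar.exp()
                    - 1 + p_logvar - logvar).sum(-1)
        return z, kl
```

The posterior parameters depend on the observed context, which is what makes the resulting densities usable for context-sensitive tasks such as lexical substitution.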
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations
In NLP, a large volume of tasks involve pairwise comparison between two
sequences (e.g. sentence similarity and paraphrase identification).
Predominantly, two formulations are used for sentence-pair tasks: bi-encoders
and cross-encoders. Bi-encoders produce fixed-dimensional sentence
representations and are computationally efficient; however, they usually
underperform cross-encoders. Cross-encoders can leverage their attention heads
to exploit inter-sentence interactions for better performance, but they require
task fine-tuning and are computationally more expensive. In this paper, we
present a completely unsupervised sentence representation model, termed
Trans-Encoder, which combines the two learning paradigms into an iterative joint
framework to simultaneously learn enhanced bi- and cross-encoders.
Specifically, starting from a pre-trained language model (PLM), we first
convert it into an unsupervised bi-encoder and then alternate between the bi-
and cross-encoder task formulations. In each alternation, one task formulation
produces pseudo-labels that serve as learning signals for the other
task formulation. We then propose an extension that conducts this
self-distillation on multiple PLMs in parallel and uses the average of
their pseudo-labels for mutual-distillation. Trans-Encoder creates, to the best
of our knowledge, the first completely unsupervised cross-encoder and also a
state-of-the-art unsupervised bi-encoder for sentence similarity. Both the
bi-encoder and cross-encoder formulations of Trans-Encoder outperform recently
proposed state-of-the-art unsupervised sentence encoders such as Mirror-BERT
and SimCSE by up to 5% on the sentence similarity benchmarks.
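To illustrate the alternating scheme, here is a self-contained PyTorch sketch with toy stand-ins for the two encoders. The class names, the MSE distillation loss, and the random vectors standing in for sentence encodings are all illustrative assumptions; the paper builds both encoders from an actual PLM:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM = 32  # toy encoding size

class ToyBiEncoder(nn.Module):
    """Encodes each sentence independently; pairs are scored by cosine."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(DIM, DIM)

    def score(self, a, b):
        return F.cosine_similarity(self.proj(a), self.proj(b), dim=-1)

class ToyCrossEncoder(nn.Module):
    """Sees both sentences jointly and predicts a similarity score."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * DIM, DIM), nn.ReLU(), nn.Linear(DIM, 1))

    def score(self, a, b):
        return self.mlp(torch.cat([a, b], dim=-1)).squeeze(-1)

def distill(teacher, student, pairs, steps=200, lr=1e-3):
    """One alternation: the teacher's scores on unlabeled sentence pairs
    become pseudo-labels that the student is trained to regress onto."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    a, b = pairs
    for _ in range(steps):
        with torch.no_grad():
            pseudo = teacher.score(a, b)  # pseudo-labels from the teacher
        loss = F.mse_loss(student.score(a, b), pseudo)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Random vectors stand in for PLM sentence encodings of unlabeled pairs.
pairs = (torch.randn(64, DIM), torch.randn(64, DIM))
bi, cross = ToyBiEncoder(), ToyCrossEncoder()
distill(bi, cross, pairs)   # bi-encoder teaches the cross-encoder ...
distill(cross, bi, pairs)   # ... then the roles swap, and the cycle repeats
```

In the mutual-distillation extension, several PLM instances would run this loop in parallel, and each student would regress onto the average of the teachers' pseudo-labels rather than a single teacher's scores.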